Handbook of Big Data by unknow
Author:unknow
Language: eng
Format: epub
Published: 2016-01-21T11:00:08+00:00
208
Handbook of Big Data
12.6.5
Complexities with Power-Law Graphs
Power-law, or highly skewed, degree distribution pose problems for efficient implementations
of generalized matrix-vector products at scale. The PowerGraph extension of GraphLab [22]
contains a number of ideas to improve performance with formal reductions and power-law
graphs based on vertex splitting ideas that create virtual copies of vertices with lower degree.
These large degree vertices, however, tend not to prohibit implementations of these ideas
on large graphs and only make them slower.
12.6.6
Implementations in SQL
Generalized matrix-vector products are possible to implement even in traditional database
systems that implement SQL [74]. Using a distributed SQL database such as Greenplum will
evaluate these tasks with reasonable efficiency even for large graphs. As an example, here we
illustrate how to perform the generalized matrix-vector product for connected components
in SQL. The graph is stored as a table that provides edge-list access. ∗ The columns head
and tail indicate the start and end of each edge. The initial vector is stored as a table with
a vertex id and the unique id associated with it (which could be the same).
edges : id | head | tail
x : id | comp
In the iteration, we create the vector x(next) from x:
CREATE TABLE xnext AS (
SELECT e.tail AS id, MIN(x.comp) AS comp
FROM edges e INNER JOIN x ON e.head = x.id
GROUP BY e.tail );
This query takes the graph structure, joins it with the vector such that each component of
the table x is mapped to the head of each edge. Then we group them by the tail of each
edge and take the MIN function over all components. This is exactly what the iteration in
the connected components example did.
12.7
Limitations and Tradeoffs in Mining Large Graph
In many cases, individuals who employ graph mining tools want to obtain some sort of
qualitative understanding of or insight into their data. This soft and subtle goal differs in
important ways from simply using the graph to obtain better prediction accuracy in some
well-defined downstream machine learning task. In particular, it can differ from the use of
the algorithms and techniques we have focused on in the previous sections, for example,
when those algorithms are used as black-box components in larger analytics frameworks such
as various machine learning pipelines that are increasingly common. A common temptation
in this setting is to use the intuition obtained from mining small graph, assuming or hoping
that the intuition thereby obtained is relevant or useful for much larger graphs. In general,
it is not. Using intuition obtained from small graphs can lead to qualitatively incorrect
understanding of the behavior and properties of larger graphs; it can lead to qualitatively
incorrect understanding of the behavior of algorithms that run on larger graphs.
∗ We assume that the graph structure is undirected, so both edges ( u, v) and ( v, u) are stored, and that
each vertex has a self-loop.
Download
This site does not store any files on its server. We only index and link to content provided by other sites. Please contact the content providers to delete copyright contents if any and email us, we'll remove relevant links or contents immediately.
The Mikado Method by Ola Ellnestam Daniel Brolund(27095)
Hello! Python by Anthony Briggs(25950)
Secrets of the JavaScript Ninja by John Resig Bear Bibeault(25286)
Kotlin in Action by Dmitry Jemerov(24396)
The Well-Grounded Java Developer by Benjamin J. Evans Martijn Verburg(23591)
Dependency Injection in .NET by Mark Seemann(23313)
OCA Java SE 8 Programmer I Certification Guide by Mala Gupta(21945)
Algorithms of the Intelligent Web by Haralambos Marmanis;Dmitry Babenko(20849)
Grails in Action by Glen Smith Peter Ledbrook(19869)
Adobe Camera Raw For Digital Photographers Only by Rob Sheppard(17073)
Sass and Compass in Action by Wynn Netherland Nathan Weizenbaum Chris Eppstein Brandon Mathis(16833)
Secrets of the JavaScript Ninja by John Resig & Bear Bibeault(14464)
Test-Driven iOS Development with Swift 4 by Dominik Hauser(12584)
Jquery UI in Action : Master the concepts Of Jquery UI: A Step By Step Approach by ANMOL GOYAL(11865)
A Developer's Guide to Building Resilient Cloud Applications with Azure by Hamida Rebai Trabelsi(10650)
Hit Refresh by Satya Nadella(9239)
The Kubernetes Operator Framework Book by Michael Dame(8588)
Exploring Deepfakes by Bryan Lyon and Matt Tora(8446)
Robo-Advisor with Python by Aki Ranin(8390)